Skip to content

fix(scout): skip attestation on hosts with no TPM device#2900

Open
s3rj1k wants to merge 1 commit into
NVIDIA:mainfrom
s3rj1k:tpm_attestation
Open

fix(scout): skip attestation on hosts with no TPM device#2900
s3rj1k wants to merge 1 commit into
NVIDIA:mainfrom
s3rj1k:tpm_attestation

Conversation

@s3rj1k

@s3rj1k s3rj1k commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Gate the attestation block on actual TPM presence so a host with no TPM skips AK/EK setup and sends discovery without AttestKeyInfo, which the API accepts when attestation_enabled is false.

Related issues

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

@s3rj1k s3rj1k requested a review from a team as a code owner June 25, 2026 21:33
@copy-pr-bot

copy-pr-bot Bot commented Jun 25, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@s3rj1k

s3rj1k commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

related: #2703

@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e7bb3ffb-cb60-4b1e-89bd-2e6f7f8b4d6f

📥 Commits

Reviewing files that changed from the base of the PR and between 09b1d85 and 810c452.

📒 Files selected for processing (2)
  • crates/scout/src/register.rs
  • crates/scout/src/tpm.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/scout/src/register.rs
  • crates/scout/src/tpm.rs

Summary by CodeRabbit

  • Bug Fixes
    • Improved attestation startup so attestation/measurement now runs only when a TPM device is actually present.
    • When running on non-DPU hosts without a detected TPM, the app logs a warning and skips attestation key setup rather than entering TPM setup.
  • New Features
    • Enhanced TPM device detection, including support for explicit device paths and common kernel TPM nodes.
  • Tests
    • Added coverage for TPM presence and device candidate detection behavior.

Walkthrough

Registration now checks TPM presence before enabling attestation. TPM path handling normalizes device-node inputs, probes kernel TPM nodes, and adds tests for candidate selection and presence checks.

Changes

TPM attestation gating

Layer / File(s) Summary
TPM device probe
crates/scout/src/tpm.rs
Normalizes TPM path inputs into device candidates, probes candidate nodes with try_exists(), and adds unit tests for explicit device paths and fallback kernel nodes.
Attestation gate in register
crates/scout/src/register.rs
Derives do_attestation from host type and TPM presence, skips attestation setup without a TPM device, and reuses the gate for the post-registration attestation block.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly matches the main change: skipping attestation when no TPM device is present.
Description check ✅ Passed The description is directly related to the change and accurately summarizes the TPM-gated attestation flow.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@s3rj1k s3rj1k force-pushed the tpm_attestation branch from 0eb4c79 to 09b1d85 Compare June 25, 2026 21:34

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/scout/src/tpm.rs (1)

195-224: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Express these tables through carbide-test-support rather than hand-rolled loops.

Both additions are textbook total-operation tables, yet they reimplement iteration and labeling manually. The house style mandates the shared helpers: tpm_device_candidates_cases is a natural value_scenarios!/check_values table, and tpm_present_probes_explicit_device_path collapses into a check_values table over (input, expected_bool) instead of four discrete assert!s. This keeps failures labeled by case and removes bespoke harness code.

♻️ Illustrative shape for the presence probe
#[test]
fn tpm_present_probes_explicit_device_path() {
    check_values(tpm_present, &[
        ("device:/dev/null", true),
        ("/dev/null", true),
        ("device:/dev/forge_scout_nonexistent_tpm", false),
        ("/dev/forge_scout_nonexistent_tpm", false),
    ]);
}

As per coding guidelines: "Prefer table-driven tests using the carbide-test-support crate with scenarios! ... and value_scenarios! ... Use check_cases / check_values directly when a macro would obscure a table."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/scout/src/tpm.rs` around lines 195 - 224, The two TPM tests are
hand-rolled table-driven checks and should use the shared carbide-test-support
helpers instead of manual loops and asserts. Replace tpm_device_candidates_cases
with a value_scenarios! or equivalent check_values-style table over
tpm_device_candidates, and collapse tpm_present_probes_explicit_device_path into
a check_values table for tpm_present so each case is labeled consistently. Keep
the existing case coverage, but move the iteration and assertion logic into the
shared helpers to match the project’s test style.

Sources: Coding guidelines, Path instructions

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/scout/src/tpm.rs`:
- Around line 195-224: The two TPM tests are hand-rolled table-driven checks and
should use the shared carbide-test-support helpers instead of manual loops and
asserts. Replace tpm_device_candidates_cases with a value_scenarios! or
equivalent check_values-style table over tpm_device_candidates, and collapse
tpm_present_probes_explicit_device_path into a check_values table for
tpm_present so each case is labeled consistently. Keep the existing case
coverage, but move the iteration and assertion logic into the shared helpers to
match the project’s test style.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3ea14b92-1ec2-489b-915c-1d29f05ec3ca

📥 Commits

Reviewing files that changed from the base of the PR and between 47e304c and 09b1d85.

📒 Files selected for processing (2)
  • crates/scout/src/register.rs
  • crates/scout/src/tpm.rs

@prbinu-nvidia

Copy link
Copy Markdown
Contributor

NICo's current design expects TPM on the host to operate. Seeking input from @ajf on feasibility of removing host-TPM dependency ( i think this would trigger more work on api-server side).

@s3rj1k

s3rj1k commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

i think this would trigger more work on api-server side

works fine if you disable attestation in config, this is exactly a case I see in env I have where TPM is not working correctly (HW thing) but with disabled attestation I can get machines to enroll into READY state

@yoks

yoks commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

This looks like a valid use case, i would like to see NICo to support non TPM hosts if config is set.

There is already config in API which allow non TPM devices (tpm_required). So while having this option and code this leads to not supporting non TPM to look missleading.

I tested similar change in the past, it works fine. Even if scout would ignore TPM it would break if tpm_required set to true.

@yoks

yoks commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

/ok to test 09b1d85

@github-actions

Copy link
Copy Markdown

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
boot-artifacts-aarch64 3 0 0 3 0 0
boot-artifacts-x86_64 3 0 0 3 0 0
forge-admin-cli-x86_64 288 6 26 105 7 144
machine-validation-runner 751 30 190 274 36 221
machine_validation 751 30 190 274 36 221
machine_validation-aarch64 751 30 190 274 36 221
nvmetal-carbide 751 30 190 274 36 221
TOTAL 3298 126 786 1207 151 1028

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

Gate the attestation block on actual TPM presence so a host with no TPM skips AK/EK setup and sends discovery without AttestKeyInfo, which the API accepts when attestation_enabled is false.

Signed-off-by: s3rj1k <evasive.gyron@gmail.com>
@s3rj1k s3rj1k force-pushed the tpm_attestation branch from 09b1d85 to 810c452 Compare June 29, 2026 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants